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ABSTRACT 

In this paper, we present a novel low rank representation 
(LRR) algorithm for data lying on the manifold of square 
root densities. Unlike traditional LRR methods which rely 
on the assumption that the data points are vectors in the 
Euclidean space, our new algorithm is designed to incorpo¬ 
rate the intrinsic geometric structure and geodesic distance 
of the manifold. Experiments on several computer vision 
datasets showcase its noise robustness and superior perfor¬ 
mance on classification and subspace clustering compared to 
other state-of-the-art approaches. 
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1. INTRODUCTION 

Dictionary learning has been proven very effective to find 
sparse representation for high dimensional data, widely used 
in many machine learning applications, such as classifica¬ 
tion [^, recognition and image restoration |3. Un¬ 
der this model, each data point can be recovered/ or rep¬ 
resented by using a linear combination of a small number 
of atoms under a dictionary. The underlying linear process 
requires that both data points and the atoms are from a 
linear space embedded in an Euclidean space and the recon¬ 
struction error is measured using / 2 -norm. As learning a dic¬ 
tionary is quite time consuming, the data point themselves 
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are generally used as a dictionary based on the so-called 
self-representative principle [^. Given a collection of data 
points X = {xi, • • • , Xn} consisting of n m + 1-dimensional 
column vectors Xi (i — l,2,...,n), the self-representative 
coding seeks a joint sparse representation of X using data 
points themselves as the dictionary, which can be formulated 
as, 

mm\\X - XWf + X\\W\\, ( 1 ) 

vv 

where W = (wi,..., w^) is the sparse representation matrix 
under the dictionary A, each column (i = l,...,n) is 
the corresponding representation of Xi, || • ||g is the sparsity 
regularizer term for the new representation W and A > 0 is 
a penalty parameter to balance the sparsity regularizer term 
and the reconstruction error. In general, the 11 norm ||IU||i 
(i.e. the sum of the absolute values of all the element in a 
matrix) is used in favor of independent sparse representation 
for each data point, while the nuclear norm ||IU||* (i.e. the 
sum of all the singular values of a matrix) is employed to 
holistically reveal the latent sparse property embedded in 
the whole data set. 

However, in the context of machine learning and vision ap¬ 
plications, feature data acquired actually satisfy extra con¬ 
straints which make them so-called manifold-valued. Take 
diffusion weighted magnetic resonance imaging (MRI) 
as an example, this non-invasive imaging technique helps ex¬ 
plore the complex micro-structure of hbrous tissues through 
sensing the Brownian motion of water molecules. Water dif¬ 
fusion is fully characterized by the diffusion probability den¬ 
sity function called the diffusion propagator (DP) [^, which 
can be represented in the form of the square root density 
functions. More specifically, for a probability density func¬ 
tion p and its continuous square root = ^/p, we have 

f 'ip‘^ds = 1 ( 2 ) 

Js 

'ip in Eq. § can be identified as a point on the unit sphere 
in a Hilbert space by expanding it using orthogonal basis 
functions. The geodesic distance between two points along 
the unit sphere manifold is longer than the corresponding 
Hilbert distance. Accordingly, when generalizing traditional 
dictionary learning or sparse coding algorithms to a sphere 
manifold, one of the key challenges to be resolved is to ex¬ 
ploit the manifold geometry. The reason is that in Hilbert 
or Euclidean space (in the cases of finite dimension), the 
global linear structure can make sure the data synthesized 
from the atoms is contained in the same space, whereas on 
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the sphere manifold, the manifold geometry provides only 
local linear structures using the Log-Euclidean metrics [^. 

By taking advantage of this local linear property, a novel 
nonlinear dictionary learning framework is proposed for 
data lying on the manifold of square root densities and ap¬ 
plied to the reconstruction of DP fields given a multi-shell 
diffusion MRI data set [^. Under this paradigm, the loss 
of global linear structure is compensated by the local linear 
structure given by the tangent space using Riemannian ex¬ 
ponential and logarithm maps. In particular, the /i-norm 
regularization is employed to produce sparse solutions with 
many zeros. However, this sparsity scheme does not con¬ 
sider the global structure of data, which is very important 
in some computer vision tasks, especially in classification 
and clustering applications. Moreover, it is not robust to 
noise in data. 

In contrast, low rank representation uses holistic con¬ 
straints as its sparse representation condition, which can re¬ 
veal the latent sparse structure property embedded in a data 
set in high dimensional space.The lowest-rank criterion can 
enforce to correct corruptions and could be robust to noise. 
Low rank representations on non-Euclidean geometry have 
received comparatively little attention. Recently, a low rank 
representation on Grassmann manifolds has been proposed 
in by mapping the Grassmann manifold onto the Eu¬ 
clidean space of symmetric matrices. The authors of 
introduced a low rank representation for symmetric positive 
definite matrices measured by the extrinsic distance defined 
by the metric on tangent spaces and the nuclear norm simul¬ 
taneously. Some researchers exploited the local manifold 
structure of the data by adding a manifold regularization 
term characterized by a Laplacian graph or manifold 
matrix factorization [^. However, the loss functions are 
still formulated in the Euclidean space. Others generalized 
Riemannian geometry to the matrix factorization with fixed- 
rank cases 

To our best knowledge, none of the existing work is spe¬ 
cialized for the low rank representation on the manifold 
of square root densities measured by the Riemmanian dis¬ 
tance and nuclear norm simultaneously, which motivates our 
study. The contribution of this work is threefold: 

• We propose a novel LRR model on the manifold of 
square root densities. The approximation quality is 
measured by the extrinsic distance defined by the met¬ 
ric on tangent spaces and the intrinsic geodesic dis¬ 
tance on the manifold simultaneously. 

• We describe a simple and effective approach for opti¬ 
mizing our objective function as the objective function 
is a composition of a quadratic term, a linear term and 
the nuclear norm term. 

• We present results on a few computer vision tasks 
on several image sets to demonstrate our new LRR 
model’s superior performance in classihcation, clus¬ 
tering and noise robustness over other state-of-the-art 
methods. 

2. PRELIMINARIES 

A manifold A4 of dimension m is a topological space 
that locally resembles a Euclidean space in a neighbour¬ 
hood of each point x G Ad. Eor example, lines and circles are 
ID manifolds, and surfaces, such as a plane, a sphere, and 


a torus, are 2D manifolds. Geometrically, a tangent vector 
is a vector that is tangent to a manifold at a given point x. 
Abstractly, a tangent vector is a function, defined on the set 
of all the smooth functions over the manifold Ad, which sat¬ 
isfies the Leibniz differential rule. All the possible tangent 
vectors at x constitute a Euclidean space, named the tangent 
space of Ad at X and denoted by T^^M. If we have a smoothly 
defined metric (inner-product) across all the tangent spaces 
(•,-)x : TxAd X TxAd ^ R on every point x G Ad, then 
we call Ad Riemannian manifold. With a globally defined 
differential structure, manifold Ad becomes a differentiable 
manifold. The m-dimensional sphere denoted by is a 
specific Riemannian manifold, which has unit radius and is 
centered at the origin of the m + 1 dimensional Euclidean 
space. A geodesic 7 : [ 0 , 1 ] ^ Ad is a smooth curve with 
a vanishing covariant derivative of its tangent vector field, 
and in particular, the Riemmannian distance between two 
points Xi, Xj G Ad is the shortest smooth path connecting 
them on the manifold, that is the infimum of the lengths of 
all geodesics joining x^ and Xj. 

There are predominantly two operations for computations 
on the Riemannian manifold, namely (1) the exponential 
map at point x, denoted by exp^ : TxAd ^ Ad, and ( 2 ) 
the logarithmic map, at point x, log^ : Ad ^ TxAd. The 
former projects a tangent vector in the tangent space onto 
the manifold, the latter does the reverse. Locally both map¬ 
pings are diffeomorphic. Note that these maps depend on 
the manifold point x at which the tangent spaces are com¬ 
puted. Given two points Xi,Xj G Ad that are close to each 
other, the distance between them can be calculated through 
the following formula as the norm in tangent space. 

distM (xi, xj) = 11 log^. (xj) 11 xi (3) 

The squared distance function dist^(x, •) is a smooth func- 
tion for all x G Ad. 


3. PROBLEM FORMULATION 

The Euclidean LRR model uses the Erobenius norm based 
on metric to model the reconstruction error, as shown in Eq. 
0 (if q =1). However, in many real world applications, high 
dimension data have a Riemannian manifold structure, such 
as face recognition and object detection [^. In ideal 
scenarios, error should be measured according to the mani¬ 
fold’s geometry. Inspired by the LRR formulation for data 
on the Riemannian manifold of symmetric positive definite 
matrices in [^, the LRR model in Eq. 0 can be changed 
to the following manifold form: 


mm 

w 


E 


i=i 


n 

+-EE 


+ A||fy||. 


dg (x^ ;Xj- ) 

e ^ \wi^ 


*=1 


n 

s.t. y^^Wji = l,z = 1 , 2 ,... ,n 

i=i 


(4) 


where z/ > 0 is a penalty parameter, cr is a distance thresh¬ 
old (default:cr = 1 ), the i-th column of matrix IT is Wi = 
denoting the low rank representation for 
Xi, and d^(xi; Xj) denotes the geodesic distance between the 
two points on the Riemannian manifold. 








There is an intuitive explanation for the formulation in 
Eq. For each point on the manifold, Xj can be pro¬ 
jected onto 0 tangent vector of tangent space at x^, other 
points Xj are projected as log^,(xj). The norm in the first 
term of Q means the distance between 0 tangent vector 
and the linearly combined tangent vector from all the other 

dg(x^-,Xj) 

projected tangent vectors. The weight term e for 

sparse representation is used to avoid assigning large weights 
to the far-away points in the manifold’s embedding space 
(see further explanation in Figure]^ a)). Therefore, mini¬ 
mization aims at finding an appropriate linear combination 
for the “best” approximation in terms of tangent vectors, 
and meanwhile, effectively avoiding sparse representations 
via far-away points that are unrelated to the local manifold 
structure. 

The difference between the Euclidean LRR and our pro¬ 
posed method is illustrated in Figure Given a toy MR 
brain image dataset of 6 people from young and old groups 
respectively, the goal is to classify these sample images ac¬ 
cording to their age group. To start with, we extract a his¬ 
togram vector for each image, the square root of which lies 
on a finite dimensional sphere manifold. Traditional LRR 
simply regards the histograms as conventional feature vec¬ 
tors, and measures data distance using Euclidean geometry 
as shown in Figure [^b). In contrast, our proposed method 
uses the square root of the histogram on the Riemannian 
manifold as input features, and the distance between input 
feature vectors on the manifold is approximated by the Eu¬ 
clidean distance between their corresponding mapped points 
on the tangent space, as shown in Figurea). In Figure 
we note that two points on the manifold having a large 
geodesic distance separating them may be near each other in 
the Euclidean space, which demonstrates that the important 
intrinsic properties of the data manifold cannot be captured 
using the Euclidean geometry. However, using Riemannian 
exponential and logarithm maps alone may fail to detect the 
local structure at point x in the manifold setting, because 
two far-away points may have a short distance on the tan- 

dg{Xi-,X^) 

gent space. Accordingly, we use a term e ^ to adjust 
weight values, thereby increasing the effect of nearby points 
in addition to their sparsity. 

When the underlying Riemannian manifold is the square 
root densities in the m-dimensional space the geodesic 
distance dg between two points x, y G is simply the great 
circle distance between two points, which is defined formally 
as 

dg (x; y) = arccos(x^y) (5) 

where arccos : [—1,1] ^ [O^^r] is the usual inverse cosine 
function. 

The tangent space at x G is 

Tx(K”*+^) = {zeK”*+^ :z^x = 0} (6) 

Then the geodesic started at x in the direction v G Tx(S"^) 
is given by the formula 

y (t) = cos(t)x + sin(t) (7) 

and the exponential and logarithm maps are given by 

expx(v) = cos(|v|)x + sin(|v|)^ (8) 

logx(y) = u arccos(x^y) / Vu^u (9) 


where u = y — (x^y)x. 

Hence the LRR on the manifold of unit sphere is defined 
by 




w 2 


i=l j = l 


vy«cos-^((x„x,»^ +A||Vy||* 


+'^EE® 

*=1 


dg (x^ ;x„- ) 


( 10 ) 


s.t. Wji = 1, ^ = 1,2,...,'; 
i=i 


where Uij = Xj — (xi,Xj)xi. The metric here is the inner 
product inherited from the standard inner product on R"^ 
given by 

(^,ry)x = ^^ry (11) 

Accordingly, the LRR problem can be re-written as 

^ n n 

W E WjiWki cos ^((Xi,Xj)) cos ^((Xi,Xfc)) 


mm 
w 2 


uTjUik 


i=l j = l,k = l 


*=1 


Uij\ • \Uik\ 

dg (x^ ;Xj- ) 




\ 


s.t. Wji — 1, i = 1, 2 ,..., -n 
j=i 

( 12 ) 

where |uij| = y/l - {:sLi,:sLjy and uJjUik = (x^,Xfc)-(xi,x_^-)(xi,x^). 
Problem (|12|) can be reformulated into the following problem 


^ E+ A||1K||* + ^ y] e 


dg (x^ ;x„- ) 


w 2 


where 


i=i j^i 
n 

s.t. = = 

i=i 


(13) 


Qi = 


cos ^((Xi,Xj)) cos ^((Xi,Xfc)) 


U^-Uifc 
|Uij| • |Uifc| 


is a n X -n matrix, and the (j, k)th element of Qi is 
cos~ ^ ((Xi, Xj }) cos~ ^ ( (xi, Xfc } ) . Kfc I • 


4. SOLUTION TO LRR ON THE MANIFOLD 
OF SQUARE ROOT DENSITIES 

In this section, we consider an algorithm to solve the con¬ 
strained optimization problem in Eq. ( |13| . We propose 
to use the Augmented Lagrange Multiplier (ALM) method 
^T] to solve it. The reason we choose the ALM to solve this 
optimization problem is threefold: (1) superior convergence 
property of ALM makes it very attractive; (2) parameter 
tuning is much easier than the iterative thresholding algo¬ 
rithm [^; and (3) it converges to an exact optimal solution. 

First of all, the augmented Lagrange problem of (13) can 





















Figure 1: The illustration of distance metrics used in our proposed method and Euclidean LRR methods: the purple curve 
denotes the manifold geodesic distance between data points x and y, the black line denotes their corresponding distance on the 

tangent space. Without the weight term e ^ , the weight between x and y would be assigned a large value. Therefore, 

this weight term is used to reduce the weights between far-away points. 


be written as 





3^^ 


dg (x^ ;Xj ) 

e ^ 


\wji\ 


n 

+ yiC^Wji - 1 ) 

3=1 


j = l J 

(14) 

where yi are Lagrangian multipliers, and /3 is a weight to 
tune the error term of Wji — 1)^. 

In fact, the above problem can be solved by updating one 
variable at a time with all the other variables fixed. More 
specihcally, the iterations of ALM go as follows: 

1) Fix all others to update W: We dehne a function F(W) 


where dF(W^^^) is a gradient matrix whose z-th row is given 
by 


QzW, 


(k) 




dg(x-X^) 


+ -\)1 (18) 

j = l 


Taking Eq.(18) into Eq.([T4|, we have 


=argmnF(VF^''^) + {dF{W^'^'^),W - 
+ P±l\W-W'''"'>\\l + X\\W\U (19) 

= argmn^||VF- - -^dF{w‘''‘'>))\\ 

+ M\w\U 


by 


FiW)=Y 

i=l 


-wfQiWi + 1/^^ 


dgix^-,X^) 


The above problem admits a closed form solution by using 
n SVD thresholding operator to A = — -^dF{W^^^). 

+ ~ 1) Taking SVD for A = , then the new iteration is given 


3 = 1 


by 



3 = 1 

and it is easy to prove that 


(15) 


dF 

dwi 




dg (x^ ;Xj) 

e 


3 7^i 


FyAF l^(^^Wji — 1)1 (16) 
3 = 1 


where 1 G 'MF is a column vector of all ones. Now at the 
current location we take a linearization of F(W)^ 


F{W) «F(ll/('')) + {dF{w''F. W - 

+ ^\\W 


( 17 ) 


= [/soft(E, ^)F^ (20) 

A 

where soft(E, a) — max{0, (E^^ — ^)} is the soft thresholding 
operator for a diagonal matrix, see . 

2) Fix all others to update yi by 

vF ^ Vi + ~ 1) (21) 

3 = 1 

After a number of iterations by alternately updating W and 
yi respectively, we achieved a complete solution to LRR on 
the manifold of square root densities. The whole procedure 
of LRR on the manifold of square root densities is summa¬ 
rized in Algorithm^ 


























































































Algorithm 1 Solving Problem ([T^ by ALM 


Require: The square root densities sample set{xi}^=i,E 
penalty parameters A and z/, 
and a distance threshold cr =1, the initial W. 

Ensure: : The low rank representation W 
1: initialize: pf = 0 (i = = 10“®,^max = 

10^°,p= 1.1,e= 10“®. 


2: u, 
3: Q 


^ :sij - (xi,Xj)xi for z, j = 1, 2,..., n; 


cos ^((Xi,Xj)) cos ^((Xi,Xfc)) 


lUziMUifcl 


z, j, /c = 1 to rz; 

do(xi;x,) 


for 


Eq.0; 


while I zcji - 1| > £ do 

W is computed by solving the problem (19) using SVD 
thresholding operator; 
yi is updated by ( [^ . 

[3 ^ min(p^, 
end while 


4.1 Convergence Analysis 

The convergence of ALM algorithm has been guaranteed 
by Theorem 3 in [^. To ensure the convergence of the 
Algorithm we only need prove that problem (13) is a 
convex optimization problem. 

Theorem 1 Each Qi is a semi-positive definite matrix, 
thus problem (13) is a convex optimization. 


by 


Proof. The jk-lYi element qjj. of Qi can be represented 


g}fe=cos ^({xi,Xj))cos ^({xi,xfc)) 


ufjUjk 
Uij I ' I I 


= COS ^((Xi,Xj))-^COS ^((Xi,Xfe)) 

I ^ij \ I U-ifc 


where Uj = cos ^ ((x^, x^)) . 

If we define a new matrix Vi by 


( 22 ) 


Ei = (23) 

then we can see Qi = V^Vi, which means Qi has a decom¬ 
position as the product of a matrix with its transpose, hence 
Qi is semi-positive definite. 


4.2 Algorithm Complexity 

For ease of analysis, we assume that x^ is a m-dimensional 
square root densities sample, and the iteration number of 
ALM algorithm is s. The complexity of Algorithm can 
be decomposed into two parts: the data preparation part 
(Steps 1-4) and the ALM solution part (Steps 5-9). 

In the data preparation part, u^ and || x^ — Xj ||2 for 
(z,j = l,...,rz) can be computed with a complexity of 
0(rz^?TZ^). The bottleneck of the data preparation proce¬ 
dure is the computation of all Qi. As we could store the 
inner product values of x^ and Xj computed in Step 2, the 
time complexity of computing Qi should be O(n^). As 
Qi = V^Vi, to break this complexity bound, we use a paral¬ 
lel matrix multiplication scheme to reduce the complexity 
to O(nlogn). Therefore, the complexity of data preparation 
is 0{ii?im?) + O(nlogn). 

In the ALM, the major computation cost is for SVD of an 
n X n matrix in Step 6 with a complexity of O(rz^). Fortu¬ 


nately, we can utilize the accelerated method in to re¬ 
duce the complexity of partial SVD computation to O(rn^), 
where r is predicted rank of For s iterations, the 

complexity of ALM solution is O(srrz^). 

With the above analysis, the overall complexity of Algo¬ 
rithmic is given as 

0(n^m^) + O(rzlogrz) + 0{srn^) (24) 

As 7TZ ^ rz, the complexity of Algorithm^mainly depends on 
the size of data set n. It can be approximated as 0(n^?7z^) + 
0{srn‘^), which is similar to the complexity of the original 
LRR algorithm 0{srn^). 


5. EXPERIMENTS 

To evaluate the proposed LRR model on the manifold of 
square root densities, we apply it to both clean and cor¬ 
rupted image datasets for image classification and image 
segmentation. 

To apply our method for image classification, we firstly 
employ our proposed LRR model to obtain the low rank fea¬ 
tures, and the classification accuracy is computed by train¬ 
ing a SVM classifier on the low rank features. In order to 
compare performances with the state-of-the-art methods, we 
use the following four baseline methods: 

• SVM on vectorized data: we directly vectorize the 
manifold data to form their Euclidean features and 
train an SVM using these Euclidean features. 

• LRR +SVM: we firstly apply the Euclidean LRR model 
in on the Euclidean features (vectorized the man¬ 
ifold data) to obtain the low rank features, and then 
train an SVM classifier on the low rank features. 


• GKNN: it is a K-nearest neighbour classifier that uses 
the geodesic distance on the manifold for determining 
neighbours, and solves the classification problem on 
the manifold directly without LRR transforms. 


• SC+SVM: it is a variant of our proposed method within 
the same framework. The only difference between SC 
+SVM and our proposed model is that h regulariza¬ 
tion on W is used in the objective function Eq.(I3). 


All SVMs used in the experiments are trained using the 
LIB SVM package [^. 

To apply our method for subspace clustering, we compute 
the low rank features using the proposed method, and then 
use these low rank features as the input of the Ncut clus¬ 
tering algorithm. The baselines used in the experiments are 
listed as follows: 


• Ncut [^: we simply conduct the Ncut clustering al¬ 
gorithm over the vectorized manifold data. 


• LRR+Ncut: we use the Euclidean LRR method on the 
Euclidean features to obtain the low rank features, and 
then apply Ncut to the low rank features. 

• Geodesic Distance based Ncut (GNcut): it is a variant 
of the traditional Ncut method, based on the Rieman- 
nian geodesic distance introduced in this paper as the 
node similarity measure. 


• SC+Ncut: it is similar to our proposed method, except 
1 1 regularization on W used in the objective function 
Eq.jfe). 

















5.1 Performance for Image Classification 

In this experiment, we evaluate the performance of our 
proposed method in terms of image classification using OA- 
SIS and LUMAR a datasets. Sample images from both 
data sets are shown in Figure 

The OASIS dataset consists of Tl-weighted MR brain im¬ 
ages from a cross-sectional collection of 416 subjects aged 
18 to 96. The subjects are all right-handed and include 
both men and women. Each MRI scan has a resolution 
of 176x208 pixels. The whole OASIS images are catego¬ 
rized into three groups: young subjects (younger than 40), 
middle-aged subjects (between 40 and 60) and old subjects 
(older than 60). We aim to classify each MRI image into its 
corresponding group. We note that the subtle differences in 
anatomical structure across different age groups in Figure, 
[^are apparent. 

The LUMBAR dataset collects MRI scans of human lum¬ 
bar spine from 60 subjects, who are randomly sampled from 
normal people and patients with lumbar degenerative dis¬ 
ease groups respectively. Each image has a resolution of 
643 x574 pixels. The classification problem is to recognize 
the abnormal lumbar spines from the normal ones. Two 
sample images from each group in the LUMBAR dataset 
are shown in Figure. 

To generate the square root densities data, we use some 
pre-processing techniques on the above two MRI datasets. 
First of all, we obtain a displacement field for each MRI 
image in the dataset using a nonrigid group-wise registration 
method described in [^. Then we compute the histogram 
of the displacement vector for each image as the feature 
for classification. In our experiment, the number of bins in 
each direction is set to 4, and the resulting 16-dimensional 
histogram is generated as the feature vector for the SVM 
and LRR+SVM methods. The square root of the histogram 
is used in the GKNN, SC+SVM and our new method. 

The classification results on the above two datasets are 
reported in Table It shows that LRR+SVM outperforms 
SVM, which indicates that low rank features are more dis¬ 
criminative for classification than the original high-dimensional 
features. As GKNN considers the intrinsic property of the 
data manifold, its performance is superior to the counter¬ 
parts using extrinsic Euclidean metric (SVM and LRR+SVM), 
which confirms the importance of intrinsic geometry. As we 
expected, our method and SG+SVM utilizing both intrinsic 
geometry and sparse feature transform (i.e. sparse coding 
and low rank representation) outperform all the other base¬ 
lines. The performance of our proposed method also outper¬ 
forms its variant method SG+SVM under the same frame¬ 
work, which demonstrates that the lowest-rank criterion is 
accurate for modeling the embedding manifold structures of 
high dimensional data. 

5.2 Performance on Subspace Clustering 

In this section, our proposed method is used to solve the 
subspace clustering problem. We evaluate our model on 
4 hyperspectral images, including pavia centre scene [10] , 
salinas-A scene [^, samson and jasper ridge [^ . 

The Pavia centre scene was acquired by the ROSIS sensor 
during a flight campaign over Pavia, nothern Italy. It is a 
1096x1096 pixels image, with 102 spectral bands. The geo¬ 
metric resolution is 1.3 meters, and the image groundtruths 
differentiate 9 classes. 

The Salinas-A scene is a small subscene of salinas image. 


Table 1: Glassification Accuracy comparisons on Glean data 
sets. For the OASIS dataset, there are three binary classi¬ 
fications (YM: Young vs Middle-aged, MO: Middle-aged vs 
Old and YO: Young vs Old), and one three-class classifica¬ 
tion (YMO: Young, Middle-aged and Old). For the LUM¬ 
BAR dataset, there is one binary classification (NA:Normal 


vs Abnormal). 


Datasets 

OASIS 

LUMBAR 

Glasses 

YM 

MO 

YO 

YMO 

NA 

SVM 

89.05 

92.58 

93.22 

91.36 

87.32 

LRR+SVM 

90.18 

93.24 

94.57 

92.58 

89.79 

GKNN 

91.75 

95.59 

96.88 

94.49 

92.95 

SG+SVM 

94.56 

96.77 

97.25 

96.17 

95.11 

Proposed 

97.25 

99.88 

99.97 

99.87 

98.8 


which was collected by the 224-band AVIRIS sensor with a 
spatial resolution of 3.7-meter pixels. The salinas-A scene 
consists of 86x83 pixels located within the salinas scene at 
[rows, columns] = [591-676, 158-240] and includes six classes. 

The Samson consists of 952x952 pixels, each of which has 
156 channels covering the wavelengths from 401 nm to 889 
nm. The spectral resolution is highly up to 3.13 nm. To 
reduce the computational cost, a sub region of 95x95 pixels 
is used, starting from the (252,332)-th pixel in the original 
image. There are three targets in this image, i.e., soil, tree 
and water respectively. 

The Jasper ridge image has 512 x 614 pixels, with 224 
channels ranging from 380 nm to 2500 nm. The spectral 
resolution is up to 9.46nm. Since this hyperspectral image is 
too complex to get the ground truth, we consider a subimage 
of 100 X 100 pixels, starting from the (105,269)-th pixel 
in the original image. In our experiments, the channels 1- 
3, 108-112, 154-166 and 220-224 are removed due to dense 
water vapor and atmospheric effect. There are four classes 
in this data: road, soil, water and tree. 

In order to extract the square root densities information 
embedded in the original hyperspectral images, we compute 
the histogram of the band values for each pixel. In our 
experiment, the number of bins is set to 6, so the result¬ 
ing 6-dimensional histogram vector is used as the pixel fea¬ 
ture vector for the Ncut and LRR+Ncut methods, while 
the square root of the histogram vector is used in GNcut, 
SG+Ncut and our new method. 

The subspace clustering results are detailed in Table 
Obviously, the performance of Ncut is inferior to that of 
LRR+Ncut, as the low rank features are more discriminative 
and useful than the data themselves for clustering problems. 
Similarly, the clustering accuracy of GNut is 3% lower than 
that of our proposed method on average. This is mainly be¬ 
cause GNcut directly works on the original high-dimensional 
manifold data, without further exploring the latent subspace 
structure hidden in the manifold data. As GNcut, SG+Ncut 
and our proposed model use the Riemannian distance mea¬ 
surement, they perform better than Ncut and LRR+Ncut, 
in which the Euclidean distance is used to measure the sim¬ 
ilarities between points on the manifold. We can assert that 
taking inherent manifold structure into account can effec¬ 
tively improve clustering accuracy. We also note that our 
proposed method has the highest accuracy among all the 
baseline methods, which suggests that the combination of 
Riemannian distance and LRR model brings good accuracy 















Figure 2: Sample Images for classification: From left to right, there are two sample images from the LUMBAR dataset 
belonging to the degenerative lumbar spine and standard lumbar spine groups, respectively, and three sample images from 
the OASIS dataset belonging to the Young, Middle-aged and Old groups, respectively. 


Table 2: Subspace clustering accuracy on hyperspectral im¬ 
ages 


Dataset 

Paviacenter 

SalinasA 

Samson 

Jasperridge 

Classes 

9 

6 

3 

4 

Ncut 

76.85 

78.59 

84.52 

83.75 

LRR+Ncut 

79.21 

81.23 

87.76 

85.02 

GNcut 

83.56 

85.97 

90.89 

88.46 

SC+Ncut 

85.01 

86.33 

93.61 

90.05 

Proposed 

88.67 

88.99 

97.01 

94.71 


sparse representation is also integrated into the model to 
avoid assigning large weights to the far-away points on the 
manifold. Furthermore, we derive an easily solvable opti¬ 
mization problem, which incorporates the structured em¬ 
bedding mapping and the intrinsic geodesic distance on the 
manifold into the LRR model. Our experiments demon¬ 
strate that our proposed method is efficient and robust to 
noise, and produces superior results compared to other state- 
of-the-art methods for classification and subspace clustering 
applications on computer vision datasets. 


for Ncut clustering. To visualize our proposed model’s effec¬ 
tiveness in subspace clustering, we illustrate the clustering 
results of the above 4 images in Figure [3pl 

5.3 Performance on Corrupted Data 

To further examine the noise robustness of the proposed 
model, we add Guassian white noises with different signal- 
to-noise ratio (SNR) values to the above 6 baseline datasets 
used for both classification and subspace clustering tasks. 
Figure illustrates the performances on all the methods, 
with SNR values ranging from 0.8 to 12.8. Unlike SC re¬ 
lated methods (SC+SVM and SC+Ncut) having comparable 
performances on the clean datasets, their performances de¬ 
teriorate when data are corrupted. In contrast, LRR based 
methods, no matter whether designed in the Euclidean space 
or on the manifold space, always have stable performances 
with different SNR values, which shows that LRR is more 
noisetolerable and robust than SC. Moreover, the geodesic 
distance based baselines like GKNN and GNcut outperform 
their linear Euclidean counterparts (i.e. LRR+SVM/Ncut, 
SVM/Ncut). This demonstrates that exploring data spatial 
correlation information in the intrinsic Riemannian geome¬ 
try helps to boost prediction performance. We also observe 
that our new method performs best among all the meth¬ 
ods on the corrupted data sets with different SNR values, 
confirming the importance of intrinsic geometry and sparse 
feature transform again. 

6 . CONCLUSIONS 

In this paper, we propose a novel LRR model on the man¬ 
ifold of square root densities, in which we exploit the in¬ 
trinsic property of the square root densities manifold in the 
Riemannian geometric context. Compared with the exist¬ 
ing Euclidean LRR algorithms, the loss of the global linear 
structure is compensated by the local linear structures given 
by the tangent spaces of the manifold. A weight term for 
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Figure 4: Subspace Clustering results on Jasper Ridge 







Figure 5: Error rate comparisons with different SNR values on 6 datasets. 
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